2D Text Visualization for the Retrieval of Malay Documents

Authors

  • NORMALY KAMAL ISMAIL
  • TENGKU MOHD TENGKU SEMBOK
Abstract

Search engines such as Google and Yahoo present their results as a one-dimensional linear list that usually spans about three screen heights per page and runs over several pages. The results are listed in declining rank order, but the decline is inconsistent and the rank values themselves are not shown. A one-dimensional linear list also makes classification of the result data meaningless. Queries related to the original query are offered, but the strength of their relationship to it is not provided. We propose an application that displays all of the result data on a single page as a two-dimensional text visualization in circular form. The strength of the relationship between a result and the query can be judged from the distance between the result's location and the center of the circle. Classifications expressed as text and color can easily be applied in the application. A Malay translation of the Al-Quran and a Malay translation of the hadith are used as corpora. Three functions in the application display the relationships between words and words, between words and documents, and between documents and documents. Various combinations of formulas can be used to compute the values of these relationships, which serve as the rank values in the application. The two-dimensional text visualization (TDTV) application is evaluated using two mechanisms: respondents first solve a task and then answer a usability questionnaire. The task results show that a variety of related documents can be retrieved in a reasonable time frame. The questionnaire results show that about 75 percent of the respondents agree that the TDTV application is better than applications that display their results as a one-dimensional linear list.

Key-Words: Information retrieval, visualization, classification, web-based, usability.
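The idea of reading relationship strength from the distance to the center of the circle can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `radial_layout` and `strength_from_position`, the assumption that rank values are normalized to [0, 1], the linear mapping from score to radius, and the even angular spacing are all hypothetical choices made only for illustration.

```python
import math


def radial_layout(scores, radius=1.0):
    """Place each item on a disc so that a higher score sits closer to the center.

    `scores` maps an item name to a rank value in [0, 1] (assumed normalization).
    Angles are spread evenly around the circle; that choice is arbitrary here.
    """
    items = sorted(scores, key=scores.get, reverse=True)
    layout = {}
    for i, name in enumerate(items):
        r = radius * (1.0 - scores[name])        # strong relation -> small distance
        theta = 2.0 * math.pi * i / len(items)   # even angular spacing
        layout[name] = (r * math.cos(theta), r * math.sin(theta))
    return layout


def strength_from_position(point, radius=1.0):
    """Recover the rank value from a plotted position (inverse of the mapping above)."""
    x, y = point
    return 1.0 - math.hypot(x, y) / radius


# Hypothetical rank values for three retrieved documents.
scores = {"doc_a": 0.9, "doc_b": 0.5, "doc_c": 0.1}
layout = radial_layout(scores)
for name, point in layout.items():
    print(name, round(strength_from_position(point), 3))
```

Under these assumptions, a reader can compare any two results by their distance from the center alone, which is the property the abstract attributes to the circular display.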


Similar resources

Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents

Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...


Visualizing Search Results in the Information Retrieval Process

Purpose: One of the most effective ways to achieve optimum information retrieval is through visualization of Information. Search strategies, probing skills, querying of information needs and analysis of information play a significant role in the accessing of necessary and useful information. Besides the factors mentioned above, information visualization can increase the availability level of in...


Investigating the Role of Homograph Context Types in Determining Similarity Between Documents

Aim: Automatic information retrieval is based on the assumption that texts contain content or structural elements that can be used in word sense disambiguation and thereby improving the effectiveness of the results retrieved. Homographs are among the words requiring sense disambiguation. Depending on their roles and positions in texts, homograph contexts could be divided to different types, wit...


Towards a Topic Driven Access to Full Text Documents

We address the issue of providing a topic driven access to full text documents. The methodology we propose is a combination of topic segmentation and information retrieval techniques. By segmenting the text into topic driven segments, we obtain small and coherent documents that can be used as a basis for the automatic generation of links, and as a visualization aid for the reader who is present...



Publication date: 2012